Abstract

Dichotomous IRT models can be viewed as families of stochastically ordered 
distributions of responses to test items. This paper explores several properties of 
such distributions. In particular, it examines under what conditions
stochastic order in families of conditional distributions is transferred to their inverse 
distributions, from two families of related distributions to a third family, or from 
multivariate conditional distributions to a marginal distribution. The main results are 
formulated as two theorems which immediately apply to dichotomous IRT models. 
One theorem holds for unidimensional models with fixed item parameters. The 
other theorem holds for models with multiple abilities or with random item 
parameters as used, for example, in adaptive testing.

Stochastic Order in Dichotomous Item Response Theory

Suppose an educational or psychological test consists of a set of n dichotomously
scored items indexed by $i = 1, \ldots, n$. Responses to item i are denoted by a random
variable $U_i$ which takes the value 1 for a correct response and the value 0
otherwise. In addition, it is assumed that examinees respond to the test items on
the basis of an ability which can be represented by a (latent) unidimensional
variable $\theta$. Item response theory (IRT) offers various stochastic models to analyze
the responses of examinees to the test items. Basic treatments of IRT are given,
for example, in Hambleton and Swaminathan (1985) and Lord (1980).

Three different ways are available to represent an item response model.
The first representation uses the idea of a response function to model the
probabilities by which an examinee responds to an item. Let $\mathrm{Prob}\{U_i = 1 \mid \theta\}$ be the
probability that an examinee with ability level $\theta$ produces a correct response to
item i, and let $P_i(\theta)$ be defined as the two-parameter logistic (2-PL) function

$$P_i(\theta) = [1 + \exp(-a_i(\theta - b_i))]^{-1}, \quad -\infty < \theta < \infty, \; -\infty < b_i < \infty, \; a_i > 0, \tag{1}$$

where $b_i$ and $a_i$ are usually interpreted as the difficulty and discriminating power of
item i, respectively. Then,

$$\mathrm{Prob}\{U_i = 1 \mid \theta\} = P_i(\theta) = [1 + \exp(-a_i(\theta - b_i))]^{-1} \tag{2}$$

is an example of the response function representation of an IRT model.
Alternatives to the two-parameter logistic model are the more sparsely
parameterized Rasch or one-parameter logistic (1-PL) model (Fischer & Molenaar, 1995)
and the Birnbaum or three-parameter logistic (3-PL) model (Hambleton &
Swaminathan, 1985; Lord, 1985). Throughout this paper, when we refer to IRT, any
of these three response models is implied. The response function representation is
standard in introductory texts on IRT. This popularity is due to the fact that it allows
for an immediate graphical interpretation of the values of the item parameters. For
dichotomously scored responses only a function for the correct response needs to
be specified; the function for the incorrect response, $1 - P_i(\theta)$, is automatically fixed.
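
As a small illustration of the response function representation, the 2-PL function in (1) can be evaluated directly. The sketch below is only an assumption-laden example: the function name `p_2pl` and the item parameter values are hypothetical and do not come from the paper.

```python
import math


def p_2pl(theta, a, b):
    """Two-parameter logistic response function P_i(theta) of Equation (1)."""
    if a <= 0:
        raise ValueError("the discrimination parameter a must be positive")
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical item: discrimination a = 1.2, difficulty b = 0.5.
curve = [(theta, p_2pl(theta, a=1.2, b=0.5)) for theta in (-2.0, 0.5, 2.0)]
for theta, prob in curve:
    print(f"theta = {theta:+.1f}  P(correct) = {prob:.3f}")
```

At $\theta = b_i$ the function equals 0.5, which is the usual graphical reading of the difficulty parameter.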

A somewhat more involved representation is based on the idea of a
(parametric) family of probability mass functions (pmfs) for the distribution of $U_i$.
This family can be denoted as $\{f_i(u_i \mid \theta): -\infty < \theta < \infty\}$, where

$$f_i(u_i \mid \theta) = P_i(\theta)^{u_i}\,[1 - P_i(\theta)]^{1 - u_i}, \quad u_i = 0, 1, \tag{3}$$

and $P_i(\theta)$ is defined by (1). This representation focuses on the conditional
probability distribution of $U_i$ given $\theta$. It is standard in texts on the statistical
treatment of the estimation of the values of the item and/or ability parameters. Its
product over the items and examinees gives the likelihood function associated with
a set of test data.

The final representation is the one of a (parametric) family of cumulative
distribution functions (cdfs) $\{F_i(u_i \mid \theta): -\infty < \theta < \infty\}$, where

$$F_i(u_i \mid \theta) = \sum_{y \le u_i} f_i(y \mid \theta), \tag{4}$$

and $f_i(y \mid \theta)$ is given by (3). This representation is the one addressed in the current
paper. In particular, the interest is in the property of stochastic order in families of
cdfs such as (4).
It is important to note the subtle differences between the first and the third
representation. First, though the logistic function itself has a well-established reputation
as a cdf in certain applications, it is not used as a cdf in the first representation, let
alone as a family of such functions. Second, the logistic function in (2) is
monotonically increasing in $\theta$ whereas the family of cdfs in (4) is nonincreasing in
$\theta$ for $u_i = 0, 1$. Though potentially confusing, these two properties of monotonicity
are closely related via a well-known theorem in statistics reviewed below.
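
The contrast between the increasing response function and the nonincreasing family of cdfs can be checked numerically. The following is a minimal sketch under assumed, hypothetical item parameters (a = 1, b = 0) and an arbitrary ability grid; the helper names are illustrative only.

```python
import math


def p_2pl(theta, a, b):
    """Response function of Equation (1)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def cdf(u, theta, a, b):
    """Cdf F_i(u | theta) of a dichotomous response with values 0 and 1."""
    if u < 0:
        return 0.0
    if u < 1:
        return 1.0 - p_2pl(theta, a, b)  # only the value 0 lies at or below u
    return 1.0


a, b = 1.0, 0.0  # hypothetical item parameters
thetas = [-3.0 + 0.5 * k for k in range(13)]

p_values = [p_2pl(t, a, b) for t in thetas]
F0 = [cdf(0, t, a, b) for t in thetas]
F1 = [cdf(1, t, a, b) for t in thetas]

increasing_p = all(x < y for x, y in zip(p_values, p_values[1:]))
nonincreasing_F0 = all(x >= y for x, y in zip(F0, F0[1:]))
nonincreasing_F1 = all(x >= y for x, y in zip(F1, F1[1:]))
print(increasing_p, nonincreasing_F0, nonincreasing_F1)
```

The cdf at u = 1 is constant at 1, which is why the family is nonincreasing rather than strictly decreasing in $\theta$.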

This paper shares an interest in stochastic order in response variables
with several other papers which treat IRT from a nonparametric perspective. Some 
useful references are: Ellis and van den Wollenberg (1993); Grayson (1988); 
Holland (1981, 1990); Holland and Rosenbaum (1986); Huynh (1994); Junker 
(1991, 1993); Mokken (1971, in press); Mokken and Lewis (1982); Molenaar (in 
press); Ramsay (1991, in press); Rosenbaum (1984, 1985); Stout (1987, 1990); 
Sijtsma (1988); Sijtsma and Junker (1994); and Sijtsma and Meijer (1992). 
However, our point of view is fully parametric. Nevertheless, it is believed to be
useful to study the consequences of certain minimal sets of assumptions on
response functions even if the abilities of the examinees or the properties of the
items are estimated under a parametric model as in (2). This study may help to
reveal certain structures in the data which otherwise might have gone unnoticed.
Several examples of such structures are discussed at the end of the paper. 
Knowledge of such structures can, in turn, suggest new diagnostics with respect to 
violations of basic assumptions underlying the model. 

The early work of Mokken (1971) as well as the follow up by Holland and 
Rosenbaum (1986) and Rosenbaum (1984, 1985) deserve special mention. These 
authors derived an important result for conditional covariances between item 
response variables from the assumptions of conditionally independent and
associated items with monotonic response functions. In fact, their result is stronger
than one of the results we derive. On the other hand, our interest is also in the 
regression of (functions of) some response variables on (functions of) other 
response variables as well as in generalization of the results to multidimensional 
response functions and tests with random item parameters as used, for example, 
in computerized adaptive testing. The body of this paper, however, consists of a 
systematic treatment of the notion of stochastic order in families of (dichotomous) 
(multivariate) random variables such as defined in (4). Several properties of these 
families will be introduced as a series of lemmas with proofs. The main results are 
then formulated as two theorems which follow immediately from the lemmas. One 
theorem holds for the conventional case of a unidimensional test with a fixed 
design. The other theorem specifies the conditions under which the results hold if 
the ability structure underlying the test is multidimensional or the test items are 
randomly assigned to the examinees. The final section discusses the application of
the results to the analysis of data obtained from several fixed and random
test designs, including a well-known adaptive testing design.

Stochastic Order 

The definition of a family of random variables stochastically ordered in a parameter 
is given in many textbooks (e.g., Lehmann, 1986). The same holds for the result 
that the expected value of a (monotonic) function of stochastically ordered 
variables is increasing in the parameter. A more comprehensive treatment of the
notion of stochastic order is typically lacking. Because of the relevance of the concept
of stochastic order for dichotomous IRT, this section of the paper tries to fill the
void. In particular, it examines under what conditions stochastic order is transferred
to a family of inverse distributions (that is, distributions in which the random
variable and parameter exchange roles) or from two given families of
distributions to a third family. Then a few properties of stochastic order in families
of multivariate (conditional) cdfs will be presented. The results will be applied to
IRT in a later section.

For simplicity, the same notation will be used for all pdfs and cdfs as well 
as for all algebraic functions used in the treatment. Also, without explicit mention it 
is assumed that all pdfs and expectations exist. Finally, to avoid complications due 
to densities equal to zero for some values of the random variables all definitions 
and results are assumed to be formulated only for the supports of their pdfs. 

Definition 1 (Monotone likelihood ratio). A family of (conditional) density functions
$\{f(y \mid x)\}$ has a monotone likelihood ratio (MLR) in y w.r.t. x if for any $x_1 > x_0$

$$\frac{f(y \mid x_1)}{f(y \mid x_0)}$$

does not decrease in y (e.g., Lehmann, 1986, p. 78).

Note that to obtain generality the likelihood ratio is not required to be
strictly increasing in y. The same relaxation is present in the following definition.

Definition 2 (Stochastic order). A family of random variables $\{Y \mid x\}$ is stochastically
ordered (SO) in x if, for all y, its cumulative distribution functions, $\{F(y \mid x)\}$, do not
increase in x (e.g., Lehmann, 1986, p. 84).

As an important consequence of the fact that no strict order is required in
the definition of SO, it holds that $\{Y \mid x\}$ is SO if X and Y are independent. This
implication will be used when we discuss Lemma 10 below.

Observe that both definitions formalize the same idea of a random
variable tending to produce larger values if another variable does the same.
However, MLR is stronger than SO (see Lemma 4 below). MLR is a useful
property in statistical inference whereas the assumption of SO is often made in
statistical modeling because it is weaker and has the nice graphical interpretation
of a family of cdfs being similarly ordered across all possible parameter values for
each possible value of its argument.
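
For discrete families both definitions can be checked mechanically. The sketch below is an illustration under assumptions of our own: a family is stored as a list of pmfs ordered by increasing x, the helper names `is_so` and `has_mlr` are hypothetical, and MLR is tested in cross-multiplied form to avoid division by zero. The second example family is constructed here (it is not from the paper) to show a family that is SO without having MLR.

```python
from fractions import Fraction as Fr
from itertools import accumulate


def is_so(rows):
    """rows[k] is the pmf of Y given the k-th smallest x.
    SO: at every y the cdf does not increase when x increases."""
    cdfs = [list(accumulate(row)) for row in rows]
    return all(hi[j] <= lo[j]
               for lo, hi in zip(cdfs, cdfs[1:])
               for j in range(len(lo)))


def has_mlr(rows):
    """MLR in cross-multiplied form: for x' > x and y' > y,
    f(y'|x') f(y|x) >= f(y|x') f(y'|x)."""
    n_x, n_y = len(rows), len(rows[0])
    return all(rows[k1][j1] * rows[k0][j0] >= rows[k1][j0] * rows[k0][j1]
               for k0 in range(n_x) for k1 in range(k0 + 1, n_x)
               for j0 in range(n_y) for j1 in range(j0 + 1, n_y))


def binomial_row(p):
    """Pmf of a binomial(2, p) variable on the values 0, 1, 2."""
    q = 1 - p
    return [q * q, 2 * p * q, p * p]


# Binomial family indexed by p = 1/4, 1/2, 3/4: has MLR and hence is SO.
binomial_rows = [binomial_row(Fr(k, 4)) for k in (1, 2, 3)]

# Hypothetical two-member family that is SO but lacks MLR.
so_only = [[Fr(4, 10), Fr(2, 10), Fr(4, 10)],
           [Fr(3, 10), Fr(3, 10), Fr(4, 10)]]

print(has_mlr(binomial_rows), is_so(binomial_rows))
print(has_mlr(so_only), is_so(so_only))
```

Exact fractions are used so that the weak inequalities in both definitions are not disturbed by rounding.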

It should be noted that though the property of SO seems to imply that the 
two variables have positive correlation, this suggestion is misleading. Positive 
correlation between variables would involve such properties as symmetry (positive 
correlation of X with Y implies positive correlation of Y with X), transitivity (positive
correlation of X with Y and Y with Z implies positive correlation of X with Z) as well 
as correlation between two variables induced by a common covariate. As shown 
below, such properties do not hold for SO. 

Expected Values 

Note that if the above two definitions hold, they also hold for X and/or Y replaced
by nondecreasing functions $\varphi_1(X)$ and $\varphi_2(Y)$. The well-known Lemmas 1 and 2
below are based on a multivariate version of this property.

Lemma 1. Let $\{Y_i \mid x;\ i = 1, \ldots, n\}$ be independently distributed with densities $f(y_i \mid x)$ and
let $\varphi(y_1, \ldots, y_n)$ be a function not decreasing in any $y_i$. If $\{f(y_i \mid x)\}$ has MLR in $y_i$
w.r.t. x for all i, then $E[\varphi(Y_1, \ldots, Y_n) \mid x]$ is a nondecreasing function of x (Lehmann,
1986, p. 85, Lemma 2(i)).

Lemma 2. Under the same conditions as in Lemma 1, if $\{Y_i \mid x;\ i = 1, \ldots, n\}$ is SO in x,
then $E[\varphi(Y_1, \ldots, Y_n) \mid x]$ is a nondecreasing function of x (Lehmann, 1986, p. 85,
Lemma 2(ii)).

Note that Lemmas 1 and 2 imply that for a single random variable Y the
expected value $E[Y \mid x]$ is a nondecreasing function of x under the conditions given.
This property is frequently used in the proofs of the lemmas presented below.

Inverse Distributions 

The question can be raised under what conditions the properties of MLR and SO 
for a family of conditional variables, $\{Y \mid x\}$, imply MLR and/or SO for the inverse
family, $\{X \mid y\}$. As it turns out, MLR is always symmetric but SO is not. However,
an exception is the case of dichotomous (functions of) random variables for which 
the two properties coincide and symmetry of SO is implied. This case is important 
for the treatment of SO in dichotomous IRT models. The results are summarized 
as follows: 

Lemma 3. $\{f(y \mid x_1, x_2)\}$ has MLR in y w.r.t. $x_1$ if and only if $\{f(x_1 \mid y, x_2)\}$ has MLR in
$x_1$ w.r.t. y for all $x_2$.

Proof. Chen, Chuang and Novick (1981, Theorem 1) offer a version of this lemma
without the conditioning variable $x_2$. Following their argument, for any $x_1' > x_1$, if
$y' > y$, then the following inequalities are equivalent:

$$\frac{f(y' \mid x_1', x_2)}{f(y' \mid x_1, x_2)} \ge \frac{f(y \mid x_1', x_2)}{f(y \mid x_1, x_2)},$$

$$\frac{f(y' \mid x_1', x_2)}{f(y \mid x_1', x_2)} \ge \frac{f(y' \mid x_1, x_2)}{f(y \mid x_1, x_2)},$$

$$\frac{f(y' \mid x_1', x_2)\, f(y \mid x_2)}{f(y \mid x_1', x_2)\, f(y' \mid x_2)} \ge \frac{f(y' \mid x_1, x_2)\, f(y \mid x_2)}{f(y \mid x_1, x_2)\, f(y' \mid x_2)}.$$

Multiplying the left-hand and right-hand side by $f(x_1' \mid x_2)/f(x_1' \mid x_2)$ and
$f(x_1 \mid x_2)/f(x_1 \mid x_2)$, respectively, gives:

$$\frac{f(x_1' \mid y', x_2)}{f(x_1' \mid y, x_2)} \ge \frac{f(x_1 \mid y', x_2)}{f(x_1 \mid y, x_2)}.$$

□
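
The symmetry in Lemma 3 can be illustrated numerically by inverting a discrete MLR family with Bayes' rule. The sketch below makes several assumptions that are not in the paper: the conditioning variable $x_2$ is dropped, the family is a binomial(2, p) family, and the prior over x is an arbitrary hypothetical choice. The helper names are illustrative.

```python
from fractions import Fraction as Fr


def has_mlr(rows):
    """Cross-multiplied MLR check; rows[k][j] = f(value j | condition k)."""
    n_c, n_v = len(rows), len(rows[0])
    return all(rows[k1][j1] * rows[k0][j0] >= rows[k1][j0] * rows[k0][j1]
               for k0 in range(n_c) for k1 in range(k0 + 1, n_c)
               for j0 in range(n_v) for j1 in range(j0 + 1, n_v))


def invert(rows, prior):
    """Bayes' rule: return the table inverse[j][k] = f(x_k | y_j)."""
    n_x, n_y = len(rows), len(rows[0])
    joint = [[prior[k] * rows[k][j] for j in range(n_y)] for k in range(n_x)]
    marg_y = [sum(joint[k][j] for k in range(n_x)) for j in range(n_y)]
    return [[joint[k][j] / marg_y[j] for k in range(n_x)] for j in range(n_y)]


def binomial_row(p):
    q = 1 - p
    return [q * q, 2 * p * q, p * p]


rows = [binomial_row(Fr(k, 4)) for k in (1, 2, 3)]  # f(y | x), x = 1/4, 1/2, 3/4
prior = [Fr(1, 2), Fr(1, 4), Fr(1, 4)]              # hypothetical f(x)

inverse = invert(rows, prior)                        # f(x | y)
print(has_mlr(rows), has_mlr(inverse))
```

The same check applied to an SO-only family would not be expected to carry over, which is the asymmetry discussed in the text.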

Though this property of symmetry seems to support the intuition of MLR 
as ‘positive correlation’ between two variables in the sense that the events of high 
(low) values on two variables tend to occur simultaneously, it is easy to show by 
counterexample that this intuition is not valid for SO. 

Lemma 4. If $\{f(y \mid x_1, x_2)\}$ has MLR in y w.r.t. $x_1$, then $\{Y \mid x_1, x_2\}$ is SO in $x_1$ and
$\{X_1 \mid y, x_2\}$ is SO in y.

Proof. From Lehmann (1986, p. 85, Lemma 2(ii)) it follows that the assumption
guarantees that $\{Y \mid x_1, x_2\}$ is SO in $x_1$. The fact that $\{X_1 \mid y, x_2\}$ is SO in y then
follows from Lemma 3. □

In the following lemmas, a variable or function is called dichotomous if it 
can take two distinct values. 

Lemma 5. If Y is dichotomous, then $\{f(y \mid x_1, x_2)\}$ has MLR in y w.r.t. $x_1$ if and only if
$\{Y \mid x_1, x_2\}$ is SO in $x_1$.

Proof. Let Y have possible values y and y', with $y' > y$. Then for any $x_1' > x_1$ the
following inequalities are equivalent:

$$\frac{f(y' \mid x_1', x_2)}{f(y' \mid x_1, x_2)} \ge \frac{f(y \mid x_1', x_2)}{f(y \mid x_1, x_2)},$$

$$\frac{1 - f(y \mid x_1', x_2)}{1 - f(y \mid x_1, x_2)} \ge \frac{f(y \mid x_1', x_2)}{f(y \mid x_1, x_2)},$$

$$f(y \mid x_1, x_2) \ge f(y \mid x_1', x_2),$$

and

$$F(y \mid x_1, x_2) \ge F(y \mid x_1', x_2).$$

Since $F(y' \mid x_1, x_2) = F(y' \mid x_1', x_2) = 1$, the required result follows. □

Lemma 6. If Y is dichotomous and $\{Y \mid x_1, x_2\}$ is SO in $x_1$, then $\{X_1 \mid y, x_2\}$ is SO in y.

Proof. Lemmas 3 and 5. □

Lemma 7. If X is dichotomous, $\{f(y \mid x)\}$ has MLR in y w.r.t. x if and only if $\{X \mid y\}$ is
SO in y.

Proof. Lemmas 3 and 5. □

Transfer of Stochastic Order 

Suppose three families of conditional distributions are given which are related to
each other because they share a common variable. Under what conditions does
SO for two of the families transfer to the third family?

Lemma 8. Let $\{Z \mid y\}$ and $\{Y \mid x\}$ be SO in y and x, respectively. Then $\{Z \mid x\}$ is SO in x
if Z and X are independent given Y=y.

Proof. It holds that

$$f(z \mid x) = \int f(z \mid y, x)\, f(y \mid x)\, dy = \int f(z \mid y)\, f(y \mid x)\, dy,$$

where the second equality uses the conditional independence of Z and X given Y. Thus,

$$F(z \mid x) = \int F(z \mid y)\, f(y \mid x)\, dy.$$

$F(z \mid y)$ is nonincreasing in y and $\{Y \mid x\}$ is SO in x. It follows from Lemma 2 that $F(z \mid x)$
does not increase in x, and thus that $\{Z \mid x\}$ is SO. □

Lemma 9. If $\{Y \mid x\}$ and $\{Z \mid x\}$ are SO in x, then $\{Z \mid y\}$ is SO in y if Y is dichotomous
and Z and Y are independent given X=x.

Proof. Lemmas 6 and 8. □

[Table 1 about here]

The example in Table 1 shows that SO is not transitive. Because
P(Y=1 | X=1) = 0.30/0.50 > P(Y=1 | X=2) = 0.20/0.50 and P(Z=1 | Y=1) = 0.30/0.50 >
P(Z=1 | Y=2) = 0.25/0.50, it follows that $F_{Y|X}(y \mid 1) \ge F_{Y|X}(y \mid 2)$ and
$F_{Z|Y}(z \mid 1) \ge F_{Z|Y}(z \mid 2)$ for all y and z, respectively. However, $F_{Z|X}(1 \mid 1) < F_{Z|X}(1 \mid 2)$ because
P(Z=1 | X=1) = 0.25/0.50 < P(Z=1 | X=2) = 0.30/0.50.
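
The failure of transitivity can be reproduced with a small joint distribution. Table 1 itself is not reproduced in this text, so the joint pmf below is a hypothetical one, constructed here only so that it matches the six probabilities quoted in the example; it should not be read as the paper's table.

```python
from fractions import Fraction as Fr

# Hypothetical joint pmf p[(x, y, z)] with X, Y, Z taking the values 1 and 2.
# Chosen to reproduce the probabilities quoted in the text; NOT the original Table 1.
p = {
    (1, 1, 1): Fr(10, 100), (1, 1, 2): Fr(20, 100),
    (1, 2, 1): Fr(15, 100), (1, 2, 2): Fr(5, 100),
    (2, 1, 1): Fr(20, 100), (2, 1, 2): Fr(0),
    (2, 2, 1): Fr(10, 100), (2, 2, 2): Fr(20, 100),
}


def cond_prob(target, given):
    """P(target | given); each argument is (axis, value), axes 0 = X, 1 = Y, 2 = Z."""
    (t_axis, t_val), (g_axis, g_val) = target, given
    num = sum(q for k, q in p.items() if k[t_axis] == t_val and k[g_axis] == g_val)
    den = sum(q for k, q in p.items() if k[g_axis] == g_val)
    return num / den


# For a variable with values 1 and 2 the cdf at 1 equals P(value = 1 | .),
# and the cdf at 2 is always 1, so SO reduces to one comparison.
y1_x1, y1_x2 = cond_prob((1, 1), (0, 1)), cond_prob((1, 1), (0, 2))
z1_y1, z1_y2 = cond_prob((2, 1), (1, 1)), cond_prob((2, 1), (1, 2))
z1_x1, z1_x2 = cond_prob((2, 1), (0, 1)), cond_prob((2, 1), (0, 2))

so_y_given_x = y1_x1 >= y1_x2
so_z_given_y = z1_y1 >= z1_y2
so_z_given_x = z1_x1 >= z1_x2
print(so_y_given_x, so_z_given_y, so_z_given_x)
```

In this constructed example Z and X are not conditionally independent given Y, which is why Lemma 8 does not apply.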

It does not hold generally that $\{Z \mid y\}$ is SO in y if $\{Y \mid x\}$ and $\{Z \mid x\}$ are.
By symmetry, $\{Y \mid z\}$ would also be SO in z, which contradicts the earlier conclusion
that SO is not symmetric. Thus the intuitive notion of two variables correlating
positively if they have a 'common covariate' does not apply here either.

Multivariate Conditioning Variables 

Families of conditional distributions with more than one conditioning variable are 
introduced and the question is raised if the property of SO is maintained if the 
transition to a single conditioning variable is made. The question is relevant for the 
treatment of stochastic order in IRT models for multivariate abilities or when an 
item parameter becomes stochastic and the model implies stochastic order w.r.t. 
this parameter as well. For simplicity, only the case of two conditioning variables is 
discussed but generalization to larger numbers of conditioning variables is readily 
obtained. 

The family $\{Y \mid x_1, x_2\}$ is defined to be SO in $x_1$ and $x_2$ if $F(y \mid x_1, x_2)$ is
nonincreasing in $x_1$ for all $x_2$ and in $x_2$ for all $x_1$. The following lemma identifies a
condition under which SO is transferred to $\{Y \mid x_1\}$:

Lemma 10. Let Y be a continuous random variable with density function f(y).
Further, $\{Y \mid x_1, x_2\}$ is assumed to be SO in $x_1$ and $x_2$. Then $\{Y \mid x_1\}$ is SO in $x_1$ if
$\{X_2 \mid x_1\}$ is SO in $x_1$.

Proof. The lemma is proved as follows:

$$f(y \mid x_1) = \int f(y, x_2 \mid x_1)\, dx_2 = \int f(y \mid x_1, x_2)\, f(x_2 \mid x_1)\, dx_2.$$

Thus,

$$F(y \mid x_1) = \int F(y \mid x_1, x_2)\, f(x_2 \mid x_1)\, dx_2.$$

Since $F(y \mid x_1, x_2)$ is nonincreasing in both $x_1$ and $x_2$ and $\{X_2 \mid x_1\}$ is SO in $x_1$, it follows from Lemma
2 that $\{Y \mid x_1\}$ is SO in $x_1$. □



Note that the condition of $\{X_2 \mid x_1\}$ being SO in $x_1$ implies that
Lemma 10 holds if $X_1$ and $X_2$ are independent. A direct proof of this implication
can be based on the fact that under independence

$$F(y \mid x_1) = \int F(y \mid x_1, x_2)\, f(x_2 \mid x_1)\, dx_2 = \int F(y \mid x_1, x_2)\, f(x_2)\, dx_2.$$

Since $F(y \mid x_1, x_2)$ and f(y) are continuous, f(y) does not change sign, and $\int f(y)\, dy$ converges by
definition, the weighted mean-value theorem for integrals (Apostol, 1967, sect.
3.19) shows that there exists a constant c such that
$F(y \mid x_1) = F(y \mid x_1, c) \int f(x_2)\, dx_2 = F(y \mid x_1, c)$, which, by assumption, is decreasing in $x_1$.

The lemma thus shows that to proceed from a multivariate to a marginal 
condition, the multivariate condition has to demonstrate SO itself. The lemma is 
also given in van der Linden and Vos (in press). Note that for $X_1$ and $X_2$ being
independent, Lemma 8 is a special case of Lemma 10.
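
The mixing step in Lemma 10 can be illustrated with a small numerical example. Everything specific in the sketch below is an assumption made for illustration: the conditional cdf is taken to be normal with mean $x_1 + x_2$ and unit variance, $X_2$ takes only the values 0 and 1 (so the integral becomes a two-term sum), and $P(X_2 = 1 \mid x_1)$ is a logistic function of $x_1$.

```python
import math


def norm_cdf(z):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def F_cond(y, x1, x2):
    """Hypothetical F(y | x1, x2): normal, mean x1 + x2, unit variance (SO in x1 and x2)."""
    return norm_cdf(y - x1 - x2)


def w(x1):
    """Hypothetical P(X2 = 1 | x1); nondecreasing in x1, so {X2 | x1} is SO in x1."""
    return 1.0 / (1.0 + math.exp(-x1))


def F_marg(y, x1):
    """F(y | x1) as the mixture of F(y | x1, x2) over the two values of x2."""
    return (1.0 - w(x1)) * F_cond(y, x1, 0) + w(x1) * F_cond(y, x1, 1)


x1_grid = [-2.0 + 0.5 * k for k in range(9)]
y_grid = [-3.0 + 0.5 * k for k in range(13)]

marginal_is_so = all(F_marg(y, hi) <= F_marg(y, lo) + 1e-12
                     for y in y_grid
                     for lo, hi in zip(x1_grid, x1_grid[1:]))
print(marginal_is_so)
```

The small tolerance only guards against rounding; it plays no role in the argument itself.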



Multivariate Distributions 

A multivariate family of random variables $\{Y_1, \ldots, Y_n \mid x\}$ is defined to be SO in x if
$\{F(y_1, \ldots, y_n \mid x)\}$ does not increase in x for all $(y_1, \ldots, y_n)$.

[Table 2 about here]

The example in Table 2 shows that SO in a series of families of univariate
distribution functions does not imply multivariate SO. For example, $F_{Y_1|X}(0 \mid 0)$ =
(0.25+0.15)/0.50 > $F_{Y_1|X}(0 \mid 1)$ = (0.10+0.00)/0.50. The same relation holds for
$F_{Y_2|X}(0 \mid x)$. However, $F_{Y_1,Y_2|X}(1, 0 \mid 0)$ = 0.05/0.50 < $F_{Y_1,Y_2|X}(1, 0 \mid 1)$ =
0.10/0.50.

The following lemma identifies a condition under which multivariate SO 
does follow from univariate SO: 

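Lemma 11. If each $\{Y_i \mid x\}$, $i = 1, \ldots, n$, is SO in x, then $\{Y_1, \ldots, Y_n \mid x\}$ is SO in x if $\{Y_i \mid x\}$,
$i = 1, \ldots, n$, are independent.

Proof. Under independence the joint cdf is the product of the univariate cdfs. The
lemma then follows immediately from the fact that the univariate cdfs are
nonnegative and do not increase in x. □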

The reverse implication, however, does hold generally: 

Lemma 12. If $\{Y_1, \ldots, Y_n \mid x\}$ is SO in x, then any subset of variables is SO in x.

Proof. A proof will be given for the case of two variables. For any $x' > x$,

$$F(y_1, y_2 \mid x) \ge F(y_1, y_2 \mid x')$$

for all values of $(y_1, y_2)$. Thus,

$$F(y_1 \mid x) = \lim_{y_2 \to \infty} F(y_1, y_2 \mid x) \ge \lim_{y_2 \to \infty} F(y_1, y_2 \mid x') = F(y_1 \mid x')$$

for all values of $y_1$. □

Functions of Random Variables 

The following lemma summarizes several results for (multivariate) functions of 
random variables with the property of stochastic order in a common conditioning 
variable: 

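Lemma 13. Let $\{Y_i \mid x\}$, $i = 1, \ldots, n$, be independent and SO in x, and let
$\varphi_1 = \varphi_1(y_1, \ldots, y_p)$, $\varphi_2 = \varphi_2(y_{p+1}, \ldots, y_q)$ and $\varphi_3 = \varphi_3(y_{q+1}, \ldots, y_n)$, $0 < p < q < n$, be
nondecreasing in each of their arguments. If $\varphi_3$ is (1) dichotomous or (2) a
nondecreasing function of $\sum_{i=q+1}^{n} y_i$, with each $y_i$ dichotomous, it holds that:

(1) $\{\Phi_1, \Phi_2, \Phi_3 \mid x\}$ is SO in x;

(2) $\{\Phi_1, \Phi_2 \mid \varphi_3\}$ is SO in $\varphi_3$;

(3) $\{\Phi_j \mid \varphi_3\}$, j=1,2, is SO in $\varphi_3$;

(4) $\{\Phi_j \mid x, \varphi_k\}$, $j \ne k = 1, \ldots, 3$, is SO in x for all values of $\varphi_k$;

(5) $\{X \mid \varphi_j, \varphi_3\}$, j=1,2, is SO in $\varphi_3$ for all values of $\varphi_j$;

(6) $\{\Phi_j \mid \varphi_k, \varphi_3\}$, $j \ne k = 1, 2$, is SO in $\varphi_3$ for all values of $\varphi_k$.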

Proof. The parts of the lemma are proved as follows:

(1) $\{\Phi_j \mid x\}$, $j = 1, \ldots, 3$, are independent and SO in x. Hence, Lemma 11 gives
the required result.

(2) For the cdf of the joint conditional distribution, it holds that

$$F(\varphi_1, \varphi_2 \mid \varphi_3) = \int F(\varphi_1, \varphi_2 \mid \varphi_3, x)\, f(x \mid \varphi_3)\, dx = \int F(\varphi_1, \varphi_2 \mid x)\, f(x \mid \varphi_3)\, dx.$$

From Lemmas 13(1) and 12 it follows that $F(\varphi_1, \varphi_2 \mid x)$ does not increase
in x and that $\{\Phi_3 \mid x\}$ is SO in x. If $\varphi_3$ is dichotomous, Lemma 6 shows
that $\{X \mid \varphi_3\}$ is SO in $\varphi_3$. If $\varphi_3$ is a nondecreasing function of
$\sum_{i=q+1}^{n} y_i$, it follows from Lemma 4 together with Grayson's (1988; see
also Huynh, 1994) result of MLR for the family of density functions
associated with $\sum_{i=q+1}^{n} Y_i \mid x$ that $\{X \mid \varphi_3\}$ is also SO in $\varphi_3$. In either case,
Lemma 2 gives us the desired result.

(3) Lemmas 13(2) and 12.

(4) As $\Phi_j$ and $\Phi_k$ are independent given x,

$$F(\varphi_j \mid x, \varphi_k) = F(\varphi_j \mid x),$$

and the result follows immediately.

(5) Lemmas 13(4) and 6.

(6) It holds that

$$F(\varphi_j \mid \varphi_k, \varphi_3) = \int F(\varphi_j, x \mid \varphi_k, \varphi_3)\, dx = \int F(\varphi_j \mid x)\, f(x \mid \varphi_k, \varphi_3)\, dx.$$

By assumption $F(\varphi_j \mid x)$ is not increasing in x whereas Lemma 13(5)
shows that $\{X \mid \varphi_k, \varphi_3\}$ is SO in $\varphi_3$ for all values of $\varphi_k$. Thus, Lemma 2
gives the desired result. □

Main Theorems

The main results for the conditional expectations and covariances between
functions on variables $\{Y_i \mid x\}$ and $\{Y_{ij} \mid x_1, x_2\}$, $i = 1, \ldots, n$, are formulated in the
following two theorems.

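Theorem 1. Let $\{Y_i \mid x\}$, $i = 1, \ldots, n$, be independent and SO in x, and let
$\varphi_1 = \varphi_1(y_1, \ldots, y_p)$, $\varphi_2 = \varphi_2(y_{p+1}, \ldots, y_q)$ and $\varphi_3 = \varphi_3(y_{q+1}, \ldots, y_n)$, $0 < p < q < n$, be
nondecreasing in each of their arguments. If $\varphi_3$ is (1) dichotomous or (2) a
nondecreasing function of $\sum_{i=q+1}^{n} y_i$ with each $y_i$ dichotomous, then:

(1) $E(\Phi_j \mid \varphi_3)$, j=1,2, is a nondecreasing function of $\varphi_3$;

(2) $\mathrm{Cov}(\Phi_1, \Phi_2 \mid \varphi_3) \ge 0$;

(3) $\mathrm{Cov}(\Phi_j, \Phi_k) \ge 0$, $j, k = 1, \ldots, 3$; $j \ne k$.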

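Proof. The three parts of the theorem are proved as follows:

(1) Lemmas 13(3) and 2.

(2) Note that

$$\mathrm{Cov}(\Phi_1, \Phi_2 \mid \varphi_3) = \mathrm{Cov}(\Phi_1, E(\Phi_2 \mid \Phi_1, \varphi_3) \mid \varphi_3).$$

Let $\tau(\varphi_1, \varphi_3) = E(\Phi_2 \mid \varphi_1, \varphi_3)$. Lemmas 13(6) and 2 show that $\tau$ is a
nondecreasing function of $\varphi_3$. It is now to be proved that

$$\mathrm{Cov}(\Phi_1, \tau(\Phi_1, \varphi_3) \mid \varphi_3) = E[\Phi_1 \tau(\Phi_1, \varphi_3) \mid \varphi_3] - E(\Phi_1 \mid \varphi_3)\, E(\tau(\Phi_1, \varphi_3) \mid \varphi_3)$$

$$= E[(\Phi_1 - E(\Phi_1))\, \tau(\Phi_1, \varphi_3) \mid \varphi_3] \ge 0,$$

where $E(\Phi_1)$ inside the brackets is shorthand for $E(\Phi_1 \mid \varphi_3)$.
Following an argument in Casella and Berger (1990, sect. 4.7.2),

$$E[(\Phi_1 - E(\Phi_1))\, \tau(\Phi_1, \varphi_3) \mid \varphi_3]$$

$$= E[(\Phi_1 - E(\Phi_1))\, \tau(\Phi_1, \varphi_3)\, I_{(-\infty, 0)}(\Phi_1 - E(\Phi_1)) \mid \varphi_3]$$

$$\quad + E[(\Phi_1 - E(\Phi_1))\, \tau(\Phi_1, \varphi_3)\, I_{[0, \infty)}(\Phi_1 - E(\Phi_1)) \mid \varphi_3]$$

$$\ge E[(\Phi_1 - E(\Phi_1))\, \tau(E(\Phi_1 \mid \varphi_3), \varphi_3)\, I_{(-\infty, 0)}(\Phi_1 - E(\Phi_1)) \mid \varphi_3]$$

$$\quad + E[(\Phi_1 - E(\Phi_1))\, \tau(E(\Phi_1 \mid \varphi_3), \varphi_3)\, I_{[0, \infty)}(\Phi_1 - E(\Phi_1)) \mid \varphi_3]$$

$$= \tau(E(\Phi_1 \mid \varphi_3), \varphi_3)\, E(\Phi_1 - E(\Phi_1) \mid \varphi_3)$$

$$= 0.$$

(3) It holds that

$$\mathrm{Cov}(\Phi_j, \Phi_k) = E(\mathrm{Cov}(\Phi_j, \Phi_k \mid X)) + \mathrm{Cov}(E(\Phi_j \mid X), E(\Phi_k \mid X)).$$

From the previous part of the theorem it follows that the first term is the
expected value of a nonnegative statistic. As $E(\Phi_j \mid X)$ and $E(\Phi_k \mid X)$ are
nondecreasing in X, a repetition of the argument in the previous part of
this proof shows that the second term is nonnegative. □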
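
The three implications of Theorem 1 can be checked by exact enumeration for a small test. The sketch below rests on assumptions of our own: six 2-PL items with hypothetical parameters, a hypothetical five-point ability distribution, and subtest scores $\Phi_1$, $\Phi_2$, $\Phi_3$ defined as number-right scores on items 1-2, 3-4 and 5-6. It is a numerical illustration, not a proof.

```python
import math
from itertools import product


def p_2pl(theta, a, b):
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical (a, b) item parameters and (theta, weight) ability distribution.
items = [(1.0, -1.0), (0.8, 0.0), (1.5, 0.5), (1.2, -0.5), (0.7, 1.0), (1.1, 0.2)]
grid = [(-2.0, 0.1), (-1.0, 0.2), (0.0, 0.4), (1.0, 0.2), (2.0, 0.1)]

# Joint pmf of (phi1, phi2, phi3), obtained by summing over all response patterns.
joint = {}
for u in product((0, 1), repeat=len(items)):
    prob = 0.0
    for theta, weight in grid:
        like = 1.0
        for ui, (a, b) in zip(u, items):
            p = p_2pl(theta, a, b)
            like *= p if ui else 1.0 - p
        prob += weight * like
    key = (u[0] + u[1], u[2] + u[3], u[4] + u[5])
    joint[key] = joint.get(key, 0.0) + prob


def moment(f, cond=lambda k: True):
    """E[f(phi1, phi2, phi3) | cond]."""
    mass = sum(q for k, q in joint.items() if cond(k))
    return sum(f(k) * q for k, q in joint.items() if cond(k)) / mass


def cov(i, j, cond=lambda k: True):
    """Cov(phi_i, phi_j | cond), with 0-based indices i and j."""
    return (moment(lambda k: k[i] * k[j], cond)
            - moment(lambda k: k[i], cond) * moment(lambda k: k[j], cond))


regression = [moment(lambda k: k[0], lambda k, s=s: k[2] == s) for s in (0, 1, 2)]
cond_covs = [cov(0, 1, lambda k, s=s: k[2] == s) for s in (0, 1, 2)]
marg_covs = [cov(0, 1), cov(0, 2), cov(1, 2)]

print("E(phi1 | phi3):", [round(v, 4) for v in regression])
print("Cov(phi1, phi2 | phi3):", [round(v, 4) for v in cond_covs])
print("Cov(phi_j, phi_k):", [round(v, 4) for v in marg_covs])
```

Exact enumeration is used instead of simulation so that sampling error cannot blur the sign of small covariances.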

It is important to observe that all three implications in Theorem 1 address 
properties of regression and covariance functions which can be observed in large 
samples. We will return to this point in the last section when applications to IRT 
are discussed more directly. Individual parts of the theorem can be found in other
places in the psychometric literature. However, they were established using
different methods of proof than the one based on the set of lemmas derived
above. The covariance property in the third part of the theorem was given earlier in
Mokken (1971) and Holland (1981). Esary, Proschan, and Walkup (1967) define
the covariance property in the third part of the theorem as association between the
underlying sets of random variables but gave no conditions under which the
property of association holds. Ahmed, Leon, and Proschan (1981, sect. 3.5) derive
association between $\Phi_1$ and $\Phi_2$ under the same conditions as used here. Ellis
(1993) proves the third part of the theorem to be valid for any subpopulation of
examinees. An important reference is Rosenbaum (1984) who gives a version of
the second part of the theorem not based on the assumption on $\varphi_3$ made here.
Finally, Junker (1993) establishes the first part of the theorem as the property of
manifest homogeneity.

All assumptions of Theorem 1 are assumed to hold in the next theorem.
The use of double indices is only to refer to rows and columns in an item × person
matrix with response data. The critical feature of the theorem is the presence of
more than one parameter needed to characterize the distributions of the
variables.

Theorem 2. Let $\{Y_{ij} \mid x_1, x_2\}$, $i = 1, \ldots, n$, $j = 1, \ldots, m$, be independent and SO in $x_1$ and $x_2$.
P and Q are defined to be the sets of indices of two disjoint subsets of variables of
$\{Y_{ij};\ i = 1, \ldots, n,\ j = 1, \ldots, m\}$. Let $\varphi_P = \varphi_P(\cdot)$ and $\varphi_Q = \varphi_Q(\cdot)$ be two functions
nondecreasing in each of the variables with indices in P and Q, respectively. It is
assumed that $\varphi_Q(\cdot)$ is either dichotomous or nondecreasing in $\sum_{(i,j) \in Q} y_{ij}$ with
each $y_{ij}$ dichotomous. Finally, $\{X_2 \mid x_1\}$ and $\{X_1 \mid x_2\}$ are assumed to be SO in $x_1$ and
$x_2$, respectively. It holds that:

(1) $E(\Phi_K \mid x_v)$, K=P,Q, is a nondecreasing function of $x_v$, v=1,2;

(2) $E(\Phi_P \mid x_v, \varphi_Q)$ is a nondecreasing function of $x_v$, v=1,2, for all values of
$\varphi_Q$.

Proof. The two parts of this theorem follow immediately from the previous lemmas:

(1) Lemmas 13(1), 10, 12, and 2.

(2) Theorem 2(1) and Lemma 13(4) (conditional independence).

This theorem identifies the conditions under which order with respect to one
parameter is maintained in a response model with more than one random person
parameter, or with a person and an item parameter. For example, the theorem implies that the
expected sum of scores in any part of the data matrix is ordered in either
parameter provided the parameters are independent or stochastically ordered
themselves. The same feature holds for the expected column and row sums of the
data matrix. The theorem thus reveals the conditions under which the row and
column sums are ordered by a person and item parameter. Other consequences
of the two theorems are presented in more detail in the corollaries in the next
section.



Applications to IRT 

As explained in the introduction, a dichotomous IRT model can be represented by
a family of cdfs $\{F_i(u_i \mid \theta): -\infty < \theta < \infty\}$ fully determined by the probabilities
$\{f_i(1 \mid \theta): -\infty < \theta < \infty\}$ modeled as a (strictly) increasing function of $\theta$ (Lemma 6).
Since $\{F_i(u_i \mid \theta)\}$ is (strictly) decreasing in $\theta$, this family is SO in $\theta$. Also, because
the response variables $U_i$ are dichotomous, it holds that $\{f(u_i \mid \theta)\}$ has MLR in $u_i$
w.r.t. $\theta$. Finally, the usual assumption of local independence between response
variables for different items guarantees the conditional independence required in
some of the lemmas and the two theorems above.

As already observed, both theorems involve several properties which can 
be observed in large samples of test data. The most important properties implied 
by Theorem 1 are summarized in Corollary 1. Some of these properties have also 
been listed elsewhere (see, for example, Rosenbaum, 1984, or Sijtsma & Junker, 
1994). 

Corollary 1. For any dichotomous IRT model with a single ability parameter and a
fixed test design it holds that:

(1) conditional item $\pi$-values given the (number-right) score on another item
or subtest are nondecreasing functions of the conditioning score;

(2) item-rest regression, defined as the regression of an item score on the
(number-right) score on the remaining items, is a nondecreasing function
of the latter;

(3) the probability of passing a cutoff score on a subtest is a nondecreasing
function of the number-right score on another subtest;

(4) if $\pi_i^H$ and $\pi_i^L$ are the $\pi$-values of item i in a high-scoring and
low-scoring subpopulation, respectively, it holds that $D = \pi_i^H - \pi_i^L$ is
nonnegative;

(5) all correlations between item scores are nonnegative;

(6) all item-rest correlations (item discrimination indices) are nonnegative;

(7) all previous properties hold in any subpopulation defined by number-right
scores on other items or subtests;

(8) all previous properties hold for weighted scores, provided the weights are
nonnegative.

Several of these properties already have a long tradition as a criterion
for item selection in classical item analysis. For example, attempts to maximize the 
internal consistency of a test have always been directed at removing items with 
negative intercorrelations and/or item-rest correlations from the test. In addition, 
the corollary confirms the status of D, typically defined using Kelley’s (1939) 27% 
rule, as a quick alternative to the item discrimination index which was popular in 
the pre-computer era. The notion that so-called formula scoring can be treated as 
equivalent to simple number-right scoring is another intuitive notion given a 
mathematical basis by the corollary. The corollary finally implies that classical item 
analysis is an effective first step to weed out items not fitting a dichotomous IRT 
model. 
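
Two of the properties in Corollary 1, the item-rest regression in (2) and the index D in (4), can be computed exactly for a small hypothetical test. The sketch below assumes five 2-PL items and a five-point ability distribution, all with made-up values. As a simplification of our own, the high-scoring and low-scoring groups are defined by the rest score (3 or 4 versus 0 or 1) rather than by Kelley's 27% rule.

```python
import math
from itertools import product


def p_2pl(theta, a, b):
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical (a, b) item parameters and (theta, weight) ability distribution.
items = [(1.2, 0.0), (0.9, -0.8), (1.4, 0.4), (0.7, 1.0), (1.0, -0.3)]
grid = [(-2.0, 0.1), (-1.0, 0.2), (0.0, 0.4), (1.0, 0.2), (2.0, 0.1)]

# joint[(u1, rest)] = P(U_1 = u1, number-right score on items 2..5 = rest)
joint = {}
for u in product((0, 1), repeat=len(items)):
    prob = sum(
        weight * math.prod(
            p_2pl(theta, a, b) if ui else 1.0 - p_2pl(theta, a, b)
            for ui, (a, b) in zip(u, items))
        for theta, weight in grid)
    key = (u[0], sum(u[1:]))
    joint[key] = joint.get(key, 0.0) + prob


def pi_given(rest_values):
    """pi-value of item 1 in the subpopulation whose rest score lies in rest_values."""
    num = sum(q for (u1, r), q in joint.items() if u1 == 1 and r in rest_values)
    den = sum(q for (u1, r), q in joint.items() if r in rest_values)
    return num / den


item_rest = [pi_given({r}) for r in range(len(items))]  # rest scores 0..4
D = pi_given({3, 4}) - pi_given({0, 1})

print("item-rest regression:", [round(v, 4) for v in item_rest])
print("D:", round(D, 4))
```

With observed data the same quantities would be estimated from sample proportions, so small violations could then arise from sampling error alone.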

Corollary 2 . In an IRT model, the properties of SO hold for a single item difficulty 
or ability parameter if: (1) the values of the item difficulty parameter are fixed; or 
(2) the values of the item difficulty parameters are random but all items are 
administered to the same examinees. 

In both test designs, the ability of the examinees and the item difficulty 
parameter are independent. Since independence implies that the distribution of 
one parameter is (not strictly) SO in the other, Theorem 2 holds. An example of 
the second design is a test sampled at random from an item pool and then 
administered to all of the examinees in the sample (domain-referenced testing). 

On the other hand, in adaptive testing, the assumption of independence
between the parameters is unlikely to hold since adaptive procedures invariably
use item selection rules in which more able examinees tend to get more difficult
items. This feature, however, suggests that the use of such rules may lead to
distributions of the values of the difficulty parameter which are SO in the ability
parameter. Suppose no constraints on the availability of the values of the item
difficulty parameter exist in the item pool. The following corollary shows that the
results in Theorem 2 apply to a currently popular procedure of adaptive testing:

Corollary 3. For an adaptive test from a 1-PL item pool based on the maximum
information principle in combination with EAP estimation of ability, the distribution
of any monotonically nondecreasing function of the examinee's response vector is
SO in $\theta$.



The following argument explains the corollary. Let $e_k = -b_k$ be the value
of the easiness parameter of the kth item in the adaptive test. Then
$e_k = e_k(u_1, \ldots, u_{k-1}) = E(\theta \mid u_1, \ldots, u_{k-1})$. However, since $\{\Theta \mid u_1, \ldots, u_{k-1}\}$ is SO in
$u_1, \ldots, u_{k-1}$ (Lemma 6), it follows that $e_k(u_1, \ldots, u_{k-1})$ is nondecreasing in each of its
arguments. Because $\{U_1, \ldots, U_{k-1} \mid \theta\}$ is SO in $\theta$, it follows that
$\{e_k(U_1, \ldots, U_{k-1}) \mid \theta\}$ is SO in $\theta$ (Lemma 2). Note that Lemma 6 holds for any prior
$f(\theta)$. In a fully Bayesian procedure, the prior can thus be chosen to be
independent of the one for the item parameter to allow us to ignore the items in
the pool not used in the test (for this condition of independence, see Mislevy &
Wu, 1988).
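
The first step of this argument, that the EAP estimate is nondecreasing in each response, can be checked numerically for a short adaptive test. The sketch below relies on assumptions that go beyond the text: the prior is a standard normal evaluated on a finite grid, the item pool is unconstrained so that each administered item can be given a difficulty equal to the current EAP (the point of maximum information under the 1-PL model), and the test length is four items.

```python
import math
from itertools import product

# Hypothetical ability grid and (unnormalized) standard normal prior weights.
grid = [-4.0 + 0.1 * k for k in range(81)]
prior = [math.exp(-0.5 * t * t) for t in grid]


def p_1pl(theta, b):
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def eap(weights):
    """Posterior mean of theta for unnormalized posterior weights on the grid."""
    return sum(t * w for t, w in zip(grid, weights)) / sum(weights)


def run_cat(responses):
    """EAP after an adaptive test in which each item's difficulty equals the
    current EAP (maximum information for the 1-PL, unconstrained pool)."""
    post = list(prior)
    for u in responses:
        b = eap(post)
        post = [w * (p_1pl(t, b) if u else 1.0 - p_1pl(t, b))
                for t, w in zip(grid, post)]
    return eap(post)


length = 4
final = {u: run_cat(u) for u in product((0, 1), repeat=length)}

# Compare every pair of response vectors that are ordered componentwise.
monotone = all(final[u] <= final[v] + 1e-12
               for u in final for v in final
               if all(x <= y for x, y in zip(u, v)))
print(monotone)
```

Note that changing an early response also changes which items are administered later, so the comparison is between complete adaptive test sessions rather than between fixed item sets.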

The following corollary summarizes a result for IRT models with a two-dimensional
ability structure:

Corollary 4. The result in Theorem 2 holds marginally in $\theta_1$ if $\{\Theta_1 \mid \theta_2\}$ is a
location family with conditional pdfs $f(\theta_1 - \mu \mid \theta_2)$.

The fact that location families have the property of SO is well documented 
(e.g., Lehmann, 1986, pp. 84-85). An important application is the case of a bivariate
normal ability distribution with constant conditional variances. As the values of the 
ability parameters are not controlled by design, it is a matter of empirical fact 
whether or not the condition in this corollary holds satisfactorily in practice. A 
statistical test for this condition could be based on the class of models with 
multivariate ability presented in Glas (1992). 
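
For the bivariate normal case the conditional distribution of $\Theta_1$ given $\theta_2$ is normal with a mean that is linear in $\theta_2$ and a constant variance, so the SO condition can be inspected directly. The sketch below is an illustration under assumptions of our own (standardized abilities, arbitrary grids, and three hypothetical correlation values). It also shows that the ordering runs in the required direction only when the correlation is nonnegative.

```python
import math


def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def cond_cdf(t1, theta2, rho, mu=(0.0, 0.0), sd=(1.0, 1.0)):
    """F(t1 | theta2) for a bivariate normal ability distribution: normal with
    mean mu1 + rho * (sd1 / sd2) * (theta2 - mu2) and sd sd1 * sqrt(1 - rho^2)."""
    mean = mu[0] + rho * sd[0] / sd[1] * (theta2 - mu[1])
    scale = sd[0] * math.sqrt(1.0 - rho * rho)
    return norm_cdf((t1 - mean) / scale)


def is_so(rho):
    """Check on a grid that F(t1 | theta2) does not increase in theta2."""
    t_grid = [-3.0 + 0.5 * k for k in range(13)]
    th2_grid = [-2.0 + 0.5 * k for k in range(9)]
    return all(cond_cdf(t, hi, rho) <= cond_cdf(t, lo, rho)
               for t in t_grid
               for lo, hi in zip(th2_grid, th2_grid[1:]))


so_positive = is_so(0.6)
so_zero = is_so(0.0)
so_negative = is_so(-0.6)
print(so_positive, so_zero, so_negative)
```

The zero-correlation case corresponds to independence, for which SO holds in the non-strict sense noted after Definition 2.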
Table 1

Numerical example showing that SO is not transitive

Table 2

Numerical example showing that univariate
SO does not imply multivariate SO